203 research outputs found

    An Efficient Transformer Decoder with Compressed Sub-layers

    The large attention-based encoder-decoder network (Transformer) has recently become prevalent due to its effectiveness. But the high computational complexity of its decoder raises an inefficiency issue. By examining the mathematical formulation of the decoder, we show that under some mild conditions, the architecture can be simplified by compressing its sub-layers, the basic building block of the Transformer, which yields higher parallelism. We thereby propose Compressed Attention Network, whose decoder layer consists of only one sub-layer instead of three. Extensive experiments on 14 WMT machine translation tasks show that our model is 1.42x faster with performance on par with a strong baseline. This strong baseline is already 2x faster than the widely used standard baseline without loss in performance. Comment: accepted by AAAI202

    Scholarship of Teaching and Learning Based on Learning Experiences and Rewards of College Students: An Investigation From Guangxi, China

    The learning experiences of college students are the main index of the scholarship learned by students. They include family background, a supportive campus environment, the students' individual efforts, social communication, utilization of university resources, study activities, course work and learning rewards. Among these, the supportive campus environment, study activities, social communication and utilization of university resources have an important influence on students' learning rewards. Analyzing the scholarship taught by teachers from the perspective of the scholarship learned by students leads to three suggestions: first, teachers should be trained in accordance with students' scholarly activities and aims, so that teaching fits students' interests and demands; second, the construction of the campus environment should be emphasized to promote the integration of the scholarship learned by students with that taught by teachers, while university resources are used and developed effectively to better serve the development of teachers and students; lastly, both the scholarship learned by students and that taught by teachers require the joint efforts of multiple parties, such as teachers, students and universities

    Eliciting Knowledge from Large Pre-Trained Models for Unsupervised Knowledge-Grounded Conversation

    Recent advances in large-scale pre-training provide large models with the potential to learn knowledge from raw text. It is thus natural to ask whether it is possible to leverage these large models as knowledge bases for downstream tasks. In this work, we answer this question in the setting of unsupervised knowledge-grounded conversation. We explore various methods that best elicit knowledge from large models. Our human study indicates that, though hallucinations exist, large models possess the unique advantage of being able to output common sense and summarize facts that cannot be directly retrieved from a search engine. To better exploit such generated knowledge in dialogue generation, we treat the generated knowledge as a noisy knowledge source and propose posterior-based reweighing as well as a noisy training strategy. Empirical results on two benchmarks show advantages over the state-of-the-art methods. Comment: Accepted to EMNLP 2022 Main Conference. The code is publicly available at https://github.com/lyy1994/PLM_as_KB/tree/main/projects/plm_as_k

    Two-and-a-half Order Score-based Model for Solving 3D Ill-posed Inverse Problems

    Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) are crucial technologies in the field of medical imaging. Score-based models have proven to be effective in addressing different inverse problems encountered in CT and MRI, such as sparse-view CT and fast MRI reconstruction. However, these models face challenges in achieving accurate three-dimensional (3D) volumetric reconstruction. Existing score-based models primarily focus on reconstructing two-dimensional (2D) data distributions, leading to inconsistencies between adjacent slices in the reconstructed 3D volumetric images. To overcome this limitation, we propose a novel two-and-a-half order score-based model (TOSM). During the training phase, our TOSM learns data distributions in 2D space, which reduces the complexity of training compared to directly working on 3D volumes. However, in the reconstruction phase, the TOSM updates the data distribution in 3D space, utilizing complementary scores along three directions (sagittal, coronal, and transaxial) to achieve a more precise reconstruction. The development of TOSM is built on robust theoretical principles, ensuring its reliability and efficacy. Through extensive experimentation on large-scale sparse-view CT and fast MRI datasets, our method demonstrates remarkable advancements and attains state-of-the-art results in solving 3D ill-posed inverse problems. Notably, the proposed TOSM effectively addresses the inter-slice inconsistency issue, resulting in high-quality 3D volumetric reconstruction. Comment: 10 pages, 13 figures

    Reduced expression of miR-22 in gastric cancer is related to clinicopathologic characteristics or patient prognosis

    OBJECTIVE: The involvement of microRNA-22 (miR-22) in cancer development has attracted much attention, but its role in the tumorigenesis of gastric cancer is still largely unknown. Therefore, the aim of this study was to investigate the expression patterns and clinical implications of miR-22 in gastric cancer. METHODS: Quantitative RT-PCR was performed to evaluate the expression levels of miR-22 in 98 pairs of gastric cancer and normal adjacent mucosa. RESULTS: Compared with normal adjacent mucosa, miR-22 expression was significantly downregulated in gastric cancer tissues (P < 0.001). Of 98 patients with gastric cancer, 58 (59.2%) were placed in the low miR-22 expression group and 40 (40.8%) were placed in the high miR-22 expression group. In addition, tumors with low miR-22 expression had a greater extent of lymph node metastasis (P = 0.02) and distant metastasis (P = 0.01), and were at a worse stage (P = 0.01) than the tumors with high miR-22 expression. Moreover, the gastric cancer patients with low miR-22 expression had shorter overall survival than those with high miR-22 expression (P = 0.03). MiR-22, as determined by multivariate analysis, was an independent prognostic factor for patients with gastric cancer. CONCLUSION: Our data offer convincing evidence that the reduced expression of miR-22 was significantly associated with malignant development of gastric cancer and may be a novel prognostic marker of this disease. miR-22 might have potential for application in cancer therapy for patients with gastric cancer

    On Effectively Learning of Knowledge in Continual Pre-training

    Pre-trained language models (PLMs) like BERT have made significant progress in various downstream NLP tasks. However, by asking models to do cloze-style tests, recent work finds that PLMs fall short in acquiring knowledge from unstructured text. To understand the internal behaviour of PLMs in retrieving knowledge, we first define knowledge-baring (K-B) tokens and knowledge-free (K-F) tokens for unstructured text and ask professional annotators to label some samples manually. Then, we find that PLMs are more likely to give wrong predictions on K-B tokens and pay less attention to those tokens inside the self-attention module. Based on these observations, we develop two solutions to help the model learn more knowledge from unstructured text in a fully self-supervised manner. Experiments on knowledge-intensive tasks show the effectiveness of the proposed methods. To the best of our knowledge, we are the first to explore fully self-supervised learning of knowledge in continual pre-training

    Skewed X-chromosome inactivation in patients with esophageal carcinoma

    ABSTRACT: Skewed X-chromosome inactivation (SXCI) has been found in some apparently healthy females, mainly from Western countries. It has been linked to the development of ovarian, breast and pulmonary carcinomas. The present study aimed to observe the SXCI frequencies in apparently healthy Chinese females and patients with esophageal carcinoma. DNA was extracted from the peripheral blood cells of 401 Chinese females without a detectable tumor and 143 female patients with esophageal carcinoma. Exon 1 of the androgen receptor (AR) gene was amplified, and the products of different CAG alleles were resolved on denaturing polyacrylamide gels and visualized after silver staining. The corrected ratios (CR) of the products before and after HpaII digestion were calculated. Among the healthy females, when CR ≥ 3 was used as a criterion, SXCI was found in two (4.3%) of the 46 neonates, 13 (7.8%) of the 166 younger adults (16–50 years) and 37 (25.7%) of the 144 elderly females (51–96 years), with the frequency higher in the elderly subjects than in the two former groups (P < 0.05). When a more stringent criterion (CR ≥ 10) was used, SXCI was found in one (2.2%), two (1.2%) and 16 (11.1%) of the subjects in the three age groups, respectively, its frequency being higher in the elderly than in the younger age groups (P < 0.05). SXCI was detected in both the patients and controls at similar frequencies. However, the phenomenon, defined as CR ≥ 3, was more frequent in the patients aged <40 years (35.7%) compared to the corresponding reference group (7.6%, P = 0.006). When CR ≥ 10 was adopted, the frequencies were 7.1% and 1.2%, respectively; their difference did not attain statistical significance (P = 0.217). SXCI also occurs in apparently healthy Chinese females, and is associated with age. It may be considered a predisposing factor for the early development of esophageal carcinoma.
VIRTUAL SLIDES: The virtual slide(s) for this article can be found here http://www.diagnosticpathology.diagnomx.eu/vs/154236433792765

    FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows

    Full text link
    Despite recent progress in open-domain dialogue evaluation, how to develop automatic metrics remains an open problem. We explore the potential of dialogue evaluation featuring dialog act information, which was rarely modeled explicitly in previous methods. However, since it is generally defined at the utterance level, dialog act is of coarse granularity, as an utterance can contain multiple segments possessing different functions. Hence, we propose segment act, an extension of dialog act from utterance level to segment level, and crowdsource a large-scale dataset for it. To utilize segment act flows (sequences of segment acts) for evaluation, we develop the first consensus-based dialogue evaluation framework, FlowEval. This framework provides a reference-free approach for dialog evaluation by finding pseudo-references. Extensive experiments against strong baselines on three benchmark datasets demonstrate the effectiveness and other desirable characteristics of our FlowEval, pointing out a potential path for better dialogue evaluation. Comment: EMNLP 2022 camera-ready version